class: center, middle, inverse, title-slide

.title[
# ISA 444/544: Business Forecasting
]
.subtitle[
## 01: Introduction to Time Series Analysis and Forecasting
]
.author[
### Fadel M. Megahed, PhD
Raymond E. Glos Professor in Business
Farmer School of Business
Miami University
@FadelMegahed
fmegahed
fmegahed@miamioh.edu
Automated Scheduler for Office Hours
]
.date[
### Fall 2025
]

---

# Learning Objectives for Today's Class

- Describe the role of forecasting in business.
- Describe the key components of a **time series** (**trend**, **seasonality**, **multiple seasonality**, and **cycles**).
- Explain the concept of a data-generating process (DGP).
- Discuss the limits of forecasting.
- Understand key forecasting terminology.

---
class: inverse, center, middle

# Course Motivation

---

## Forecasting Impacts Everything and Everyone

.pull-left[

- **Businesses**: Sales forecasts set revenue targets, inventory projections optimize supply chains, and staffing plans ensure workforce readiness during demand fluctuations.

- **Gov.**: Tax revenue and social program forecasts aid budgeting and resource allocation.

- **Individuals**: Financial forecasts support budgeting, saving, and retirement planning.

- **Weather**: Forecasts inform agriculture, disaster prep, and daily decisions.

]

.pull-right[
<center>
<br>
<img src="data:image/png;base64,#../../figures/60870-2.png" alt="CBO Baseline Budget Projections" style="width: auto; height: auto" />

.font80[Source: [Congressional Budget Office](https://www.cbo.gov/publication/61172#_idTextAnchor040)]

<br><br><br>

<a class="weatherwidget-io" href="https://forecast7.com/en/39d47n84d75/45056/" data-label_1="OXFORD" data-label_2="WEATHER" data-theme="original" >OXFORD WEATHER</a>
<script>
!function(d,s,id){var js,fjs=d.getElementsByTagName(s)[0];if(!d.getElementById(id)){js=d.createElement(s);js.id=id;js.src='https://weatherwidget.io/js/widget.min.js';fjs.parentNode.insertBefore(js,fjs);}}(document,'script','weatherwidget-io-js');
</script>

.font80[Source: [Weather Widget's Forecast](https://weatherwidget.io/)]
</center>
]

---

## Microsoft's Missed Opportunity with Mobile Phones

.pull-left-2[
.font90[
> **Q**: People get passionate when Apple comes out with something new—the iPhone; of course ...
**Is that something you'd want them to feel about Microsoft?**

<br>

> **A**: It's sort of a funny question. **Would I trade 96% of the market for 4%?** *(Laughter.)*

> **I want to have products that appeal to everybody.** There's no chance that the iPhone is going to get any significant market share. **No chance.** It's a $500 subsidized item. They may make a lot of money.

> But if you actually take a look at the 1.3 billion phones that get sold, **I'd prefer to have our software in 60% or 70% or 80% of them**, than I would **2% or 3%, which is what Apple might get**.
]
]

.pull-right-2[
.center[

.font90[
Steve Ballmer, former CEO of Microsoft, in his infamous interview with [USA Today 2007](https://web.archive.org/web/20070502033654/https://www.usatoday.com/money/companies/management/2007-04-29-ballmer-ceo-forum-usat_N.htm)
]
]
]

---

## Microsoft's Missed Opportunity: Back of a Napkin Calc.

.pull-left[
.font90[

- Since Q1 2009, **Windows' mobile OS market share has been <= 2.5%**, and is now at 0.02% ([StatCounter](https://gs.statcounter.com/os-market-share/mobile/worldwide/#quarterly-200901-202403)).

- **Apple's mobile iOS market share has been > 19%**, and is now at **27.69%** ([StatCounter](https://gs.statcounter.com/os-market-share/mobile/worldwide/#quarterly-200901-202403)).

- Apple's **iPhone revenues** from 2007 to 2024 were **$2.037 trillion** (per [statista](https://www.statista.com/statistics/263402/apples-iphone-revenue-since-3rd-quarter-2007/)).

- Assuming Microsoft could have captured just **5%** of Apple's iPhone revenue `\(\rightarrow\)` **$102 billion**.

- This estimate **excludes app store and brand value**, which would make the missed opportunity even larger.
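
The napkin arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the revenue total and the 5% share are the figures quoted in the bullets, and the variable names are illustrative.

```python
# Back-of-the-napkin estimate; both inputs are the figures quoted on this slide.
iphone_revenue_usd = 2.037e12   # Apple's iPhone revenue, 2007 to 2024 (per statista)
assumed_capture_share = 0.05    # hypothetical share Microsoft might have captured

missed_revenue_usd = iphone_revenue_usd * assumed_capture_share
missed_revenue_billions = round(missed_revenue_usd / 1e9)

print(f"Missed revenue: about ${missed_revenue_billions} billion")
# prints "Missed revenue: about $102 billion"
```

Changing `assumed_capture_share` shows how sensitive the estimate is to that single assumption.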
]
]

.pull-right[
<img src="data:image/png;base64,#01_intro_files/figure-html/market_share-1.png" width="576" style="display: block; margin: auto;" />
<div style="margin-top:-30px;">
<img src="data:image/png;base64,#01_intro_files/figure-html/apple_revenue-3.png" width="576" style="display: block; margin: auto;" />
</div>
]

---

## Other Real-World Forecasting Failures in Tech

.pull-left[
.center[
<img src="data:image/png;base64,#../../figures/1024px-Ibm_px_xt_color.jpg" alt="IBM PC" style="width: auto; height: 190px" />

.font80[**IBM**: Missed the PC revolution.]

<img src="data:image/png;base64,#../../figures/1024px-Asahi_Pentax_S3_with_film.jpg" alt="Kodak Camera" style="width: auto; height: 190px;" />

.font80[**Kodak**: Missed the digital camera revolution.]
]
]

.pull-right[
.center[
<img src="data:image/png;base64,#../../figures/The_Last_Blockbuster_storefront.jpg" alt="Blockbuster" style="width: auto; height: 190px;" />

.font80[**Blockbuster**: Missed the streaming revolution.]

<img src="data:image/png;base64,#../../figures/Ysearch_2005.png" alt="Yahoo Logo" style="width: auto; height: 190px;" />

.font80[**Yahoo**: Missed the search engine revolution.]
]
]

---

## Non-Tech Failures: Red Lobster's Endless Shrimp

.center[
<img src="data:image/png;base64,#../../figures/red_lobster.png" alt="Red Lobster endless shrimp deal was too popular; a key reason for the company's 11 million dollar loss in the third quarter of 2023" style="width: auto; height: 525px;" />
]

---

## Non-Tech Failures: Target's Overestimation

.center[
<img src="data:image/png;base64,#../../figures/target.png" alt="Target Stock Plunges 21% on Weak Sales Ahead of Holiday Season" style="width: auto; height: 525px;" />
]

---

## Why Do These Stories Matter?

**(1) Forecasting Errors = Real Money Lost**

- **Microsoft**: $102 billion in potential mobile phone revenues.
- **Target:** .bold[Overestimation] leads to unsold stock or overservicing (e.g., [Target's stock plunging by 21% due to lower profit and larger inventories](https://www.nytimes.com/2024/11/20/business/target-earnings-holiday-shopping.html?unlocked_article_code=1.sU4.BN7Y.DRvnq7F_52NO&smid=url-share)).

- **Red Lobster**: $11M in losses due to .bold[underestimating] demand for their endless shrimp deal.

<br>

**(2) Course Relevance:** This class will teach you how to identify .black[.bold[trends]], .black[.bold[seasonality]], and .black[.bold[cycles]], and .black[.bold[how to apply forecasting tools and models]] so you can **avoid these pitfalls** in your future roles. The goal is to allow you to make **data-driven forecasts**, not just gut-based decisions, and more importantly, be able to **quantify the uncertainty** in your forecasts.

---

## Can You Avoid Common Forecasting Mistakes?
.panelset[

.panel[.panel-name[Description]

- **Scenario**: You are a business analyst at a hotel chain.

- **Problem**: You are tasked with forecasting hotel occupancy for the next 11 months.

- **Data**: You have access to the hotel's monthly room occupancy data from 2022 to January 2025. Download the file [here](https://miamioh.instructure.com/courses/240425/files/36340085?module_item_id=6161473).

- **Task:** .black[.bold[Without the use of any AI tools]], create a forecast for the hotel's occupancy for the next 11 months. The forecast can be made in Excel, R, or Python. .black[.bold[Document the process and the rationale behind your forecasts.]]

- **Non-graded Class Activity:** Input your logic for the 11-month forecast, and your quantitative forecast for **February 2025** in the next 2 tabs, respectively.
]

.panel[.panel-name[Your Logic]

Use the editable text box below to describe your logic for forecasting the hotel's occupancy for the next 11 months. Use bullet points to list your steps.

.can-edit.key-activity1_logic[
**Steps Taken to Generate the Forecasts:** .font70[(Insert below)]

- Edit me
- ...
- ...
]
]

.panel[.panel-name[Your Sol]

- Input your solution for **February 2025** by using the QR code below.

<img src="data:image/png;base64,#../../figures/qr_code_activity01.png" width="30%" style="display: block; margin: auto;" />
]

.panel[.panel-name[Class Results]
<div style='position: relative; padding-bottom: 56.25%; padding-top: 35px; height: 0; overflow: hidden;'><iframe sandbox='allow-scripts allow-same-origin allow-presentation' allowfullscreen='true' allowtransparency='true' frameborder='0' height='285' src='https://www.mentimeter.com/app/presentation/alrufhucrqefifc266ma7ormoe6xntmw/embed' style='position: absolute; top: 0; left: 0; width: 100%; height: 100%;' width='420'></iframe></div>
]

.panel[.panel-name[Fadel's Logic]

<details>
<summary>Let us discuss how to approach such a problem.
</summary>
<img src="../../figures/sim_hotel_occupancy.png" alt="Fadel's Logic">
</details>

.can-edit.key-activity1_viz[
**Fadel's approach to this problem:** .font70[(To be discussed in class)]

- Edit me
- ...
- ...
]
]

.panel[.panel-name[Fadel's Sol]

<details>
<summary>Let us think of a reasonable solution, which assumes no prior time-series knowledge. </summary>
<small>

```python
import pandas as pd

df_hotel = (
    pd.read_csv("../../data/01_hotel_occupancy.csv")
    .assign(
        date=lambda x: pd.to_datetime(x['date']),
        year=lambda x: x['date'].dt.year,
    )
)

# plotting of the data hidden for brevity

# compute the average occupancy per year and year-over-year growth rate
# (2025 is dropped because only January 2025 is available)
yearly_avg = df_hotel.groupby('year')['hotel_occupancy'].mean().drop(2025)
yoy_growth = yearly_avg.pct_change().mean()

monthly_2024 = df_hotel.query("year == 2024")  # filter the data for 2024

# forecast the occupancy for 2025: each 2024 month scaled by the average growth rate
forecast = pd.DataFrame({
    "date": pd.date_range(start="2025-01-01", end="2025-12-01", freq="MS"),
    "forecasted_occupancy": (
        monthly_2024['hotel_occupancy'] * (1 + yoy_growth)
    ).round(0).astype(int)
})

forecast.iloc[0:2].reset_index(drop=True)
```

</small>
</details>
]
]

---
class: inverse, center, middle

# Types of Data Over Time and the Components of a Time Series

---

## Cross-Sectional Data

**Cross-sectional data** captures multiple variables at a single point in time for each observation; e.g., .bold[all the variables within a given observation] in the [DoL's LCA Disclosure Data for 2025 Q3](https://www.dol.gov/sites/dolgov/files/ETA/oflc/pdfs/LCA_Disclosure_Data_FY2025_Q3.xlsx) were collected simultaneously.

.font70[
]

---

## Time Series Data

**Time series data** captures a single variable at multiple points in time; e.g., the [daily stock prices for Apple](https://finance.yahoo.com/quote/AAPL/history?p=AAPL) or our **simulated monthly hotel room occupancy** dataset.

<img src="data:image/png;base64,#01_intro_files/figure-html/time_series-1.png" width="1440" style="display: block; margin: auto;" />

---

## Panel Data

**Panel data** captures multiple variables at multiple points in time for each observation; e.g., the [Panel Study of Income Dynamics](https://psidonline.isr.umich.edu/) or the [World Bank's World Development Indicators](https://databank.worldbank.org/source/world-development-indicators).

.font80[
]

.footnote[
<html>
<hr>
</html>

**Source:** Data queried from the [World Bank Data](https://datacatalog.worldbank.org/) using the [wbstats](https://cran.r-project.org/web/packages/wbstats/wbstats.pdf) package in R. The printed results show a snapshot of 7 variables (out of a much larger panel dataset). You can think of panel data as a cross-sectional dataset with a longitudinal/time component.
]

---

## Components of Time Series Data: Trend

- **Trend**: A long-term increasing or decreasing pattern over time.

- **Example**: The [US GDP](https://fred.stlouisfed.org/series/GDP) has a long-term upward trend.

<img src="data:image/png;base64,#01_intro_files/figure-html/trend-1.png" style="display: block; margin: auto;" />

---

## Components of Time Series Data: Seasonality

**Seasonality** refers to the property of a time series that displays REGULAR patterns that repeat at a constant frequency (*m*). For example, the [number of retail trade workers](https://fred.stlouisfed.org/series/CEU4200000001) has a seasonal pattern (with an upward trend).

<img src="data:image/png;base64,#01_intro_files/figure-html/ecommerce-1.png" style="display: block; margin: auto;" />

---

## Components of Time Series Data: Multiple Seasonality

**Multiple seasonality** refers to the property of a time series that displays multiple seasonal patterns that repeat at different frequencies.

<img src="data:image/png;base64,#01_intro_files/figure-html/multiple_seasonality-1.png" width="1440" style="display: block; margin: auto;" />

---

## Components of Time Series Data: Cycles

**Cycles** refer to the property of a time series that displays irregular patterns that repeat at irregular frequencies. For example, the [US Total Vehicle Sales](https://fred.stlouisfed.org/series/TOTALSA) has business cycles that are influenced by economic conditions and advancements in vehicle technologies.

<img src="data:image/png;base64,#01_intro_files/figure-html/cycles-3.png" style="display: block; margin: auto;" />

---

## Components of Time Series Data: Cycles

**Cycles** refer to the property of a time series that displays irregular patterns that repeat at irregular frequencies.
For example, the [US Total Vehicle Sales](https://fred.stlouisfed.org/series/TOTALSA) has business cycles that are influenced by economic conditions and advancements in vehicle technologies.

<img src="data:image/png;base64,#01_intro_files/figure-html/cycles2-1.png" style="display: block; margin: auto;" />

---

# Kahoot Competition #01

To assess your understanding and retention of the topics covered so far, you will **compete in a Kahoot competition (consisting of 5 questions)**:

- Go to <https://kahoot.it/>
- Enter the game pin, which will be shown during class
- Provide your first (preferred) and last name
- Answer each question within the allocated 20-second window (**fast and correct answers provide more points**)

<br>

**Winning the competition involves having as many correct answers as possible AND taking the shortest duration to answer these questions.** The winner of the competition will receive a **0.15 bonus on Assignment 01**. Good luck!!!

.footnote[
<html>
<hr>
</html>

**P.S.:** The Kahoot competition will have **no impact on your grade**. It is a **fun** way of assessing your knowledge, motivating you to ask questions about topics that you do not fully understand, and providing me with some data that I can use to pace today's class.
]

---
class: inverse, center, middle

# The Data Generating Process

---

## The Idea of a Data Generating Process (DGP)

- A **time series** is defined as a **sequence of observations** recorded at regular time intervals.

- Any time series is generated by some kind of mechanism, which is often referred to as a **data generating process (DGP)**. For example, the hotel occupancy dataset is impacted by:
    - .black[.bold[season]], .black[.bold[holidays]], .black[.bold[economic conditions]], and .black[.bold[marketing campaigns]];
    - .black[.bold[number of rooms]], .black[.bold[room rates]], and .black[.bold[customer satisfaction]];
    - .black[.bold[weather]], .black[.bold[local events]], and .black[.bold[competition]]; and
    - .black[.bold[number of rooms already booked]], .black[.bold[room cancellations]], and .black[.bold[no-shows]].

- The **DGP** is the **underlying theoretical mechanism** that generates the data we observe.
    - It accounts for both systematic patterns (e.g., trend, seasonality) and randomness.

- **But**: In real-world settings, there is often **no perfectly known** DGP.
    - Any formula or model we write is an approximation of the **unknowable “truth.”**

---

## Model vs. Reality: The Map Analogy for DGPs

.pull-left[

- **A map ≠ the territory**:
    - We use maps to navigate, but they are always simplified.
    - Similarly, a forecast model ≠ reality; it is a *purposeful* simplification.

- **Different maps for different needs**:
    - A tourist map highlights landmarks, while a transportation map focuses on roads to inform navigation.
    - Each addresses *specific* questions, just as we build different forecasting models for different objectives.

]

.pull-right[  <br>  ]

---

## Why Use DGPs If They Do Not Actually Exist?

**(1) Guiding Principle:**

- Thinking in terms of a *hypothetical* DGP helps us design or select reasonable model structures.
- E.g., we incorporate domain insights: “Does our hotel occupancy data show strong seasonality?”

**(2) Clarifying Assumptions:**

- Even if the DGP is not known, stating assumptions (e.g., no trend, constant variance) makes our models testable and improvable.

**(3) Iterative Refinement:**

- As new data contradict our assumptions, we adjust our “map” of reality.
- In forecasting, we continually update models to capture changing conditions.

---

## Key DGP Takeaways

- **All Models Are Wrong…**
    - ...but some are *useful* for forecasting, planning, or decision-making.

- **The DGP Is a Useful Fiction**
    - We talk about it to structure our thinking.
    - We never truly “see” it; we only see **data**.

- **Practical Implication**
    - A good model is *close enough* to guide accurate forecasts.
    - Remain aware of model limitations and be ready to adapt.

---
class: inverse, center, middle

# "What Can (and Cannot) We Forecast?"

---

## Rank These Scenarios in Terms of Forecastability
.panelset[

.panel[.panel-name[Description]

- **Rank** each scenario (in the next tab) from **easiest** (1) to **hardest** (6) to predict.

- **Submit** your ranking by clicking [here](https://www.menti.com/al4qbc8hg3jnm)
]

.panel[.panel-name[Scenarios]

- **Lottery winning numbers for next weekend**
- **Sunrise time in Oxford, Ohio on January 1, 2026**
- **Maximum temperature in Oxford, Ohio tomorrow**
- **Daily electricity demand in 3 days**
- **Google's stock price in 1 week**
- **Google's stock price in 1 year**
]

.panel[.panel-name[Class Results]
<div style='position: relative; padding-bottom: 56.25%; padding-top: 35px; height: 0; overflow: hidden;'><iframe sandbox='allow-scripts allow-same-origin allow-presentation' allowfullscreen='true' allowtransparency='true' frameborder='0' height='285' src='https://www.mentimeter.com/app/presentation/alrufhucrqefifc266ma7ormoe6xntmw/embed' style='position: absolute; top: 0; left: 0; width: 100%; height: 100%;' width='420'></iframe></div>
]
]

---

## Perfect (or Near-Perfect) Forecasts

- **Examples**
    1. **Sunset Times**:
        - Based on precise astronomical calculations.
        - We can predict sunset to the exact minute, *tomorrow* or even a year from now.
    2. **Tides**:
        - Governed by well-modeled gravitational forces of the Moon and Sun.
        - Highly predictable for centuries into the future.

- **Why So Certain?**
    - These phenomena follow *deterministic* (or near-deterministic) physical laws.
    - Little to no stochastic “noise” in the process.

---

## Partially Predictable: Weather & Markets

.pull-left[
**Weather**

- *Tomorrow’s Forecast*: Quite accurate (initial conditions + physical models).
- *1 Year Ahead*: Chaos and changing conditions degrade accuracy significantly.
]

.pull-right[
**S&P 500**

- *Tomorrow’s Close*: Some short-term signals exist, but accuracy is limited (especially if you are attempting to beat the market; accuracy is relatively high if you just want to be in the ballpark of the `adjusted close`).
- *1 Year Ahead*: Many unknown macro shocks make precise forecasts very uncertain.
]

---

## Unpredictable: Lottery Numbers

.pull-left[

- **No Predictable Pattern**
    - Draws are *engineered* to be random.
    - No matter how much data you collect, you can’t *outpredict* chance.

- **Why “Un-forecastable”?**
    - The Data-Generating Process (DGP) is effectively *pure noise* by design.
    - No structural or deterministic component to model.
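
To see why more history does not help here, the sketch below simulates a fair single-ball draw and scores a "pick the ball drawn most often so far" rule. The 10-ball game, the seed, and the function name are assumptions made purely for illustration; this is not a model of any real lottery.

```python
import random
from collections import Counter


def most_frequent_so_far_hit_rate(n_draws=20_000, n_balls=10, seed=444):
    """Share of draws correctly guessed by a rule that uses all past draws."""
    rng = random.Random(seed)
    counts = Counter()
    hits = 0
    for _ in range(n_draws):
        # Predict the ball seen most often so far (ball 1 before any history exists).
        prediction = counts.most_common(1)[0][0] if counts else 1
        draw = rng.randint(1, n_balls)  # fair draw, independent of the past
        hits += prediction == draw
        counts[draw] += 1
    return hits / n_draws


hit_rate = most_frequent_so_far_hit_rate()
print(f"Hit rate using history: {hit_rate:.3f} vs. blind guessing: {1 / 10:.3f}")
```

Under these assumptions the hit rate should hover around 1/10, which is the same as guessing blindly, because each draw is independent of everything that came before it.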
]

.pull-right[  ]

---

## Relating It Back to the DGP

- **Different Types of DGPs**
    1. **Deterministic (or nearly so)**: Sunset times, tidal schedules.
    2. **Complex & Partly Stochastic**: Weather, financial markets.
    3. **Pure Randomness**: Lottery draws.

- **Key Lesson**
    - *All* these processes have a DGP; some are more “knowable” than others.
    - *Forecastability* depends on how much of that DGP is deterministic vs. random and how well we can model it.

---
class: inverse, center, middle

# Key Forecasting Terms

---

## Forecasting

- **Forecasting** is the process of using *historical data* and *patterns* to predict *future* values or events.

- The objective of most **time series analyses** is to provide forecasts of future values of the time series.

<img src="data:image/png;base64,#01_intro_files/figure-html/forecasting-1.png" width="1152" style="display: block; margin: auto;" />

<img src="data:image/png;base64,#../../figures/forecasting_example.png" width="100%" style="display: block; margin: auto;" />

---

## Explanatory Forecasting

- In addition to past data, **explanatory forecasting** uses *additional variables* to predict future values of the variable(s) of interest. To make the forecasts, we need both historical values and future predictions for each of the explanatory variables. For example,
    - **Forecasting electricity demand** using weather forecasts, time of day, day of week, etc.

.pull-left[  ]

.pull-right[  ]

.footnote[
<html>
<hr>
</html>

**Image Sources:** Nixtla's demo for forecasting the remaining useful life of an engine using exogenous sensor data. See [here](https://nixtlaverse.nixtla.io/neuralforecast/docs/getting-started/introduction.html) for more details.
]

---

## Backtesting

- Backtesting is the practice of **evaluating** a forecasting model by applying it to **historical data** and comparing the predictions with the actual outcomes.

- A way to see how the model **would have performed** in the past.
- Backtesting is the time series equivalent of a **train-test split** in ML; however, the split is **not random**, since the test data must come after the training data in time.

<img src="data:image/png;base64,#../../figures/ChainedWindows.gif" width="60%" style="display: block; margin: auto;" />

.footnote[
<html>
<hr>
</html>

**Image Source:** [Chained Windows](https://github.com/Nixtla/statsforecast/blob/main/nbs/imgs/ChainedWindows.gif) from Nixtla's GitHub Repo for *statsforecast*.
]

---

## In-Sample vs. Out-of-Sample Metrics

- **In-Sample Metrics**:
    - Metrics calculated on the **same data** used to train the model.
    - Can be misleading, as the model has already seen this data.

- **Out-of-Sample Metrics**:
    - Metrics calculated on **data not seen by the model** during training (i.e., test/holdout/out-of-sample data).
    - More reliable indicators of how the model will perform on new data.

---
class: inverse, center, middle

# Recap

---

# Summary of Main Points

By now, you should be able to do the following:

- Describe the role of forecasting in business.
- Describe the key components of a **time series** (**trend**, **seasonality**, **multiple seasonality**, and **cycles**).
- Explain the concept of a data-generating process (DGP).
- Discuss the limits of forecasting.
- Understand key forecasting terminology.

---

# 📝 Review and Clarification 📝

1. **Class Notes**: Take some time to revisit your class notes for key insights and concepts.
2. **Zoom Recording**: The recording of today's class will be made available on Canvas approximately 3-4 hours after the session ends.
3. **Questions**: Please don't hesitate to ask for clarification on any topics discussed in class. It's crucial not to let questions accumulate.
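
---

## Optional: A Minimal Backtesting Sketch

Returning to the backtesting and out-of-sample ideas from the key terms, the sketch below hand-rolls a rolling-origin backtest with an expanding training window. Everything in it is an assumption made for illustration: the series is simulated (trend + yearly seasonality + noise), the seasonal naive forecaster is a stand-in baseline, and the function names are hypothetical. It is not the *statsforecast* API.

```python
import math
import random


def simulate_monthly_series(n_months=60, seed=1):
    """Hypothetical DGP: linear trend + yearly seasonality + Gaussian noise."""
    rng = random.Random(seed)
    return [
        100 + 0.5 * t + 10 * math.sin(2 * math.pi * t / 12) + rng.gauss(0, 2)
        for t in range(n_months)
    ]


def seasonal_naive(history, horizon, m=12):
    """Forecast each future step with the value from the same season one cycle ago."""
    return [history[len(history) - m + (h % m)] for h in range(horizon)]


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def backtest(series, initial_train=36, horizon=6, step=6):
    """Refit at each cutoff on all data before it, then score the next `horizon` points."""
    results = []
    cutoff = initial_train
    while cutoff + horizon <= len(series):
        train = series[:cutoff]                  # only the past is visible
        test = series[cutoff:cutoff + horizon]   # held-out, strictly later data
        forecast = seasonal_naive(train, horizon)
        results.append((cutoff, mean_absolute_error(test, forecast)))
        cutoff += step
    return results


series = simulate_monthly_series()
results = backtest(series)
for cutoff, mae in results:
    print(f"train on first {cutoff} months -> out-of-sample MAE = {mae:.2f}")
```

Each window's error is computed only on observations that come after the training data, which is what separates an out-of-sample metric from an in-sample one.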

---

# 📖 Recommended Readings 📖

#### 🐍 Python Prep

- [Getting Started with Conda](https://conda.io/projects/conda/en/latest/user-guide/getting-started.html)
- [Data Structures](https://docs.python.org/3/tutorial/datastructures.html)

#### 🤖 LLM Prep

- [A Very Gentle Introduction to Large Language Models without the Hype](https://mark-riedl.medium.com/a-very-gentle-introduction-to-large-language-models-without-the-hype-5f67941fa59e)

---

# 🎯 Assignment 🎯

- Go over your notes and complete [Assignment 01](https://miamioh.instructure.com/courses/240425/quizzes/741952) on Canvas.